智能论文笔记

Reasoning About Causal Models With Infinitely Many Variables

Joseph Y. Halpern , Spencer Peters

分类：人工智能

2021-12-21

广义结构方程模型（GSEM）[Peters和Halpern 2021]，作为名称表明，结构方程模型（SEM）的概括。他们可以在不同的许多变量中处理（以及其他物种，这对于捕获动态系统至关重要。我们在GSEM中提供了一种声音和完整的Aximatizing，即哈珀[2000]为SEM提供的声音和完整的公理化的延伸。考虑到GSEM有助于澄清Halpern的公理捕获的属性。

translated by 谷歌翻译

Causal Modeling With Infinitely Many Variables

Spencer Peters , Joseph Y. Halpern

分类：人工智能

2021-12-16

结构方程式模型（SEM）可能是用于建模因果关系的最常用的框架。然而，正如我们所示，天真地将该框架延伸到无限的多个变量，例如，要为模型动态系统而导入几个问题。我们介绍GSEMS（广义SEM），灵活的SEM直接指定干预结果，其中（1）微分方程的系统可以以自然和直观的方式表示，（2）某些自然情况，不能由SEM表示，可以轻松表示，（3）SEM中实际因果关系的定义基本上没有变化。

translated by 谷歌翻译

MAUD: An Expert-Annotated Legal NLP Dataset for Merger Agreement Understanding

Steven H. Wang , Antoine Scardigli , Leonard Tang , Wei Chen , Dimitry Levkin , Anya Chen , Spencer Ball , Thomas Woodside , Oliver Zhang , Dan Hendrycks

分类：自然语言处理

2023-01-02

Reading comprehension of legal text can be a particularly challenging task due to the length and complexity of legal clauses and a shortage of expert-annotated datasets. To address this challenge, we introduce the Merger Agreement Understanding Dataset (MAUD), an expert-annotated reading comprehension dataset based on the American Bar Association's 2021 Public Target Deal Points Study, with over 39,000 examples and over 47,000 total annotations. Our fine-tuned Transformer baselines show promising results, with models performing well above random on most questions. However, on a large subset of questions, there is still room for significant improvement. As the only expert-annotated merger agreement dataset, MAUD is valuable as a benchmark for both the legal profession and the NLP community.

translated by 谷歌翻译

A System-Level View on Out-of-Distribution Data in Robotics

Rohan Sinha , Apoorva Sharma , Somrita Banerjee , Thomas Lew , Rachel Luo , Spencer M. Richards , Yixiao Sun , Edward Schmerling , Marco Pavone

分类：机器人 | 机器学习

2022-12-28

When testing conditions differ from those represented in training data, so-called out-of-distribution (OOD) inputs can mar the reliability of black-box learned components in the modern robot autonomy stack. Therefore, coping with OOD data is an important challenge on the path towards trustworthy learning-enabled open-world autonomy. In this paper, we aim to demystify the topic of OOD data and its associated challenges in the context of data-driven robotic systems, drawing connections to emerging paradigms in the ML community that study the effect of OOD data on learned models in isolation. We argue that as roboticists, we should reason about the overall system-level competence of a robot as it performs tasks in OOD conditions. We highlight key research questions around this system-level view of OOD problems to guide future research toward safe and reliable learning-enabled autonomy.

translated by 谷歌翻译

AVstack: An Open-Source, Reconfigurable Platform for Autonomous Vehicle Development

R. Spencer Hallyburton , Shucheng Zhang , Miroslav Pajic

分类：机器人

2022-12-28

Pioneers of autonomous vehicles (AVs) promised to revolutionize the driving experience and driving safety. However, milestones in AVs have materialized slower than forecast. Two culprits are (1) the lack of verifiability of proposed state-of-the-art AV components, and (2) stagnation of pursuing next-level evaluations, e.g., vehicle-to-infrastructure (V2I) and multi-agent collaboration. In part, progress has been hampered by: the large volume of software in AVs, the multiple disparate conventions, the difficulty of testing across datasets and simulators, and the inflexibility of state-of-the-art AV components. To address these challenges, we present AVstack, an open-source, reconfigurable software platform for AV design, implementation, test, and analysis. AVstack solves the validation problem by enabling first-of-a-kind trade studies on datasets and physics-based simulators. AVstack solves the stagnation problem as a reconfigurable AV platform built on dozens of open-source AV components in a high-level programming language. We demonstrate the power of AVstack through longitudinal testing across multiple benchmark datasets and V2I-collaboration case studies that explore trade-offs of designing multi-sensor, multi-agent algorithms.

translated by 谷歌翻译

Visual Tactile Sensor Based Force Estimation for Position-Force Teleoperation

Yaonan Zhu , Shukrullo Nazirjonov , Bingheng Jiang , Jacinto Colan , Tadayoshi Aoyama , Yasuhisa Hasegawa , Boris Belousov , Kay Hansel , Jan Peters

分类：机器人

2022-12-26

Vision-based tactile sensors have gained extensive attention in the robotics community. The sensors are highly expected to be capable of extracting contact information i.e. haptic information during in-hand manipulation. This nature of tactile sensors makes them a perfect match for haptic feedback applications. In this paper, we propose a contact force estimation method using the vision-based tactile sensor DIGIT, and apply it to a position-force teleoperation architecture for force feedback. The force estimation is done by building a depth map for DIGIT gel surface deformation measurement and applying a regression algorithm on estimated depth data and ground truth force data to get the depth-force relationship. The experiment is performed by constructing a grasping force feedback system with a haptic device as a leader robot and a parallel robot gripper as a follower robot, where the DIGIT sensor is attached to the tip of the robot gripper to estimate the contact force. The preliminary results show the capability of using the low-cost vision-based sensor for force feedback applications.

translated by 谷歌翻译

Capacity Studies for a Differential Growing Neural Gas

P. Levi , P. Gelhausen , G. Peters

分类：神经与进化计算

2022-12-23

In 2019 Kerdels and Peters proposed a grid cell model (GCM) based on a Differential Growing Neural Gas (DGNG) network architecture as a computationally efficient way to model an Autoassociative Memory Cell (AMC) \cite{Kerdels_Peters_2019}. An important feature of the DGNG architecture with respect to possible applications in the field of computational neuroscience is its \textit{capacity} refering to its capability to process and uniquely distinguish input signals and therefore obtain a valid representation of the input space. This study evaluates the capacity of a two layered DGNG grid cell model on the Fashion-MNIST dataset. The focus on the study lies on the variation of layer sizes to improve the understanding of capacity properties in relation to network parameters as well as its scaling properties. Additionally, parameter discussions and a plausability check with a pixel/segment variation method are provided. It is concluded, that the DGNG model is able to obtain a meaningful and plausible representation of the input space and to cope with the complexity of the Fashion-MNIST dataset even at moderate layer sizes.

translated by 谷歌翻译

HINT: Hypernetwork Instruction Tuning for Efficient Zero-Shot Generalisation

Hamish Ivison , Akshita Bhagia , Yizhong Wang , Hannaneh Hajishirzi , Matthew Peters

分类：自然语言处理

2022-12-20

Recent NLP models have the great ability to generalise `zero-shot' to new tasks using only an instruction as guidance. However, these approaches usually repeat their instructions with every input, requiring costly reprocessing of lengthy instructions for every inference example. To alleviate this, we introduce Hypernetworks for INstruction Tuning (HINT), which convert task instructions and examples using a pretrained text encoder into parameter-efficient modules inserted into an underlying model, eliminating the need to include instructions in the model input. Compared to prior approaches that concatenate instructions with every input instance, we find that HINT models are significantly more compute-efficient and consistently outperform these approaches for a given inference budget.

translated by 谷歌翻译

ROSCOE: A Suite of Metrics for Scoring Step-by-Step Reasoning

Olga Golovneva , Moya Chen , Spencer Poff , Martin Corredor , Luke Zettlemoyer , Maryam Fazel-Zarandi , Asli Celikyilmaz

分类：自然语言处理 | 机器学习

2022-12-15

Large language models show improved downstream task performance when prompted to generate step-by-step reasoning to justify their final answers. These reasoning steps greatly improve model interpretability and verification, but objectively studying their correctness (independent of the final answer) is difficult without reliable methods for automatic evaluation. We simply do not know how often the stated reasoning steps actually support the final end task predictions. In this work, we present ROSCOE, a suite of interpretable, unsupervised automatic scores that improve and extend previous text generation evaluation metrics. To evaluate ROSCOE against baseline metrics, we design a typology of reasoning errors and collect synthetic and human evaluation scores on commonly used reasoning datasets. In contrast with existing metrics, ROSCOE can measure semantic consistency, logicality, informativeness, fluency, and factuality - among other traits - by leveraging properties of step-by-step rationales. We empirically verify the strength of our metrics on five human annotated and six programmatically perturbed diagnostics datasets - covering a diverse set of tasks that require reasoning skills and show that ROSCOE can consistently outperform baseline metrics.

translated by 谷歌翻译

Information-Theoretic Safe Exploration with Gaussian Processes

Alessandro G. Bottero , Carlos E. Luis , Julia Vinogradska , Felix Berkenkamp , Jan Peters

分类：机器学习

2022-12-09

We consider a sequential decision making task where we are not allowed to evaluate parameters that violate an a priori unknown (safety) constraint. A common approach is to place a Gaussian process prior on the unknown constraint and allow evaluations only in regions that are safe with high probability. Most current methods rely on a discretization of the domain and cannot be directly extended to the continuous case. Moreover, the way in which they exploit regularity assumptions about the constraint introduces an additional critical hyperparameter. In this paper, we propose an information-theoretic safe exploration criterion that directly exploits the GP posterior to identify the most informative safe parameters to evaluate. Our approach is naturally applicable to continuous domains and does not require additional hyperparameters. We theoretically analyze the method and show that we do not violate the safety constraint with high probability and that we explore by learning about the constraint up to arbitrary precision. Empirical evaluations demonstrate improved data-efficiency and scalability.

translated by 谷歌翻译